Reconstructing complex networks from measurable data is a fundamental problem for understanding and 
controlling collective dynamics of complex networked systems. However, a significant challenge arises when 
we attempt to decode structural information hidden in limited amounts of data accompanied by noise and in the 
presence of inaccessible nodes. Here, we develop a general framework for robust reconstruction of complex 
networks from sparse and noisy data. Specifically, we decompose the task of reconstructing the whole network 
into recovering local structures centered at each node. Thus, the natural sparsity of complex networks ensures 
a conversion from the local structure reconstruction into a sparse signal reconstruction problem that can be 
addressed by using the lasso, a convex optimization method. We apply our method to evolutionary games, 
transportation and communication processes taking place in a variety of model and real complex networks, 
finding that universal high reconstruction accuracy can be achieved from sparse data in spite of noise in time 
series and missing data of partial nodes. Our approach opens new routes to the network reconstruction problem 
and has potential applications in a wide range of fields. 
Complex networked systems are common in many fields. The need to ascertain collective dynamics of
such systems to control them is shared among different scientific communities. Much evidence has
demonstrated that interaction patterns among dynamical elements captured by a complex network play
deterministic roles in collective dynamics. It is thus imperative to study a complex networked
system as a whole rather than study each component separately to offer a comprehensive
understanding of the whole system. However, we are often incapable of directly accessing network
structures; instead, only limited observable data are available, raising the need for network
reconstruction approaches to uncover network structures from data. Network reconstruction, the
inverse problem, is challenging because structural information is hidden in measurable data in an
unknown manner and the solution space of all possible structural configurations is of extremely
high dimension. So far a number of approaches have been proposed to address the inverse problem.
However, accurate and robust reconstruction of large complex networks is still a challenging
problem, especially given limited measurements disturbed by noise and unexpected factors.

In this letter, we develop a general framework to reconcile the contradiction between the
robustness of reconstructing complex networks and limits on our ability to access sufficient
amounts of data required by conventional approaches. The key lies in converting the network
reconstruction problem into a sparse signal reconstruction problem that can be addressed by
exploiting the lasso, a convex optimization algorithm [18, 19]. In particular, reconstructing the
whole network structure can be achieved by inferring local connections of each node individually
via our framework. The natural sparsity of complex networks suggests that on average the number of
real connections of a node is much less than the number of all possible connections, i.e., the
size of a network. Thus, identifying the direct neighbors of a node from the pool of all nodes in a
network is analogous to the problem of sparse signal reconstruction. By using the lasso, which
incorporates both an error control term and an $L_1$-norm, the neighbors of each node can be
reliably identified from a small amount of data that can be much less than the size of a network.
The $L_1$-norm, according to the compressed sensing theory [20], ensures the sparse data
requirement while, simultaneously, the error control term ensures the robustness of reconstruction
against noise and missing nodes. The whole network can then be assembled by simply matching
neighboring sets of all nodes. We will validate our reconstruction framework by considering three
representative dynamics, including ultimatum games [21], transportation and communications, taking
place in both homogeneous and heterogeneous networks. Our approach opens new routes towards
understanding and controlling complex networked systems and has implications for many social,
technical and biological networks.

We articulate our reconstruction framework by taking ultimatum games as a representative example.
We then apply the framework to the transportation of electrical current and communications via
sending data packets.

In evolutionary ultimatum games (UG) on networks, each node is occupied by a player. In each round,
player $i$ plays the UG twice with each of his/her neighbors, both as a proposer and a responder
with strategy $(p_i, q_i)$, where $p_i$ denotes the amount offered to the other player if $i$
proposes and $q_i$ denotes the minimum acceptance level if $i$ responds [24, 25]. The profit of
player $i$ obtained in the game with player $j$ is calculated as follows

$$
U_{ij} =
\begin{cases}
p_j + 1 - p_i, & p_i \ge q_j \text{ and } p_j \ge q_i, \\
1 - p_i,       & p_i \ge q_j \text{ and } p_j < q_i, \\
p_j,           & p_i < q_j \text{ and } p_j \ge q_i, \\
0,             & p_i < q_j \text{ and } p_j < q_i,
\end{cases}
\tag{1}
$$

where $p_i, p_j \in [0,1]$. The payoff $g_i$ of $i$ at a round is the sum of all profits from
playing UG with $i$'s neighbors, i.e., $g_i = \sum_{j \in \Gamma_i} U_{ij}$, where $\Gamma_i$
denotes the set of $i$'s neighbors. In each round, all participants play the UG with their direct
neighbors simultaneously and gain payoffs. Players update their strategies $(p, q)$ in each round
by learning from the neighbor with the highest payoff. To be concrete, player $i$ selects the
neighbor with the maximum payoff $g_{\max}(t)$ and takes over the neighbor's strategy with
probability $W(i \leftarrow \max) = g_{\max}(t)/[g_i(t) + g_{\max}(t)]$.

To better mimic real situations, random mutation rates are included in each round: all players
adjust their $(p, q)$ according to $(p_i(t+1), q_i(t+1)) = (p_i(t) + \delta, q_i(t) + \delta)$,
where $\delta \in [-\varepsilon, \varepsilon]$ is a small random number [27]. Without loss of
generality, we set $\varepsilon = 0.05$ and $p, q \in [0,1]$. During the evolution of UG, we assume
that only the time series of $(p_i(t), q_i(t))$ and $g_i(t)$ ($i = 1, \cdots, N$) are measurable.
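
The update step described above can be sketched as follows. This is a minimal illustration, not the authors' simulation code: it assumes the imitation probability has the form $g_{\max}/(g_i + g_{\max})$, that the same $\delta$ is added to both $p$ and $q$, and that strategies are clipped back into $[0,1]$ after mutation. The function names and the zero-payoff convention are hypothetical.

```python
import random

def imitation_probability(g_i, g_max):
    """W(i <- max) = g_max / (g_i + g_max); 0 if both payoffs vanish (assumed convention)."""
    total = g_i + g_max
    return g_max / total if total > 0 else 0.0

def mutate(p, q, eps, rng):
    """Add the same delta in [-eps, eps] to p and q, clipped to [0, 1] (clipping is an assumption)."""
    delta = rng.uniform(-eps, eps)
    clip = lambda x: min(1.0, max(0.0, x))
    return clip(p + delta), clip(q + delta)

def update_strategy(i, neighbors, strategies, payoffs, eps=0.05, rng=random):
    """One update of player i: imitate the best-paid neighbor with probability W, then mutate."""
    best = max(neighbors[i], key=lambda j: payoffs[j])
    p, q = strategies[i]
    if rng.random() < imitation_probability(payoffs[i], payoffs[best]):
        p, q = strategies[best]
    return mutate(p, q, eps, rng)
```

In a synchronous simulation, `update_strategy` would be evaluated for every player from the strategies and payoffs of the current round before any strategy is overwritten.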

The network reconstruction can be initiated from the relationship between strategies
$(p_i(t), q_i(t))$ and payoffs $g_i(t)$. Note that $g_i(t) = \sum_{j=1}^{N} a_{ij} U_{ij}(t)$,
where $a_{ij} = 1$ if players $i$ and $j$ are connected and $a_{ij} = 0$ otherwise. Moreover,
$U_{ij}$ is exclusively determined by the strategies of $i$ and $j$. These imply that hidden
interactions between $i$ and its neighbors can be extracted from the relationship between
strategies and payoffs, enabling the inference of $i$'s links based solely on the strategies and
payoffs. Necessary information for recovering $i$'s links can be acquired with respect to different
time $t$. Specifically, for $M$ accessible time instances $t_1, \cdots, t_M$, we convert the
reconstruction problem into the matrix form $Y_i = \Phi_i \times X_i$:

$$
\begin{pmatrix} g_i(t_1) \\ g_i(t_2) \\ \vdots \\ g_i(t_M) \end{pmatrix}
=
\begin{pmatrix}
U_{i1}(t_1) & U_{i2}(t_1) & \cdots & U_{iN}(t_1) \\
U_{i1}(t_2) & U_{i2}(t_2) & \cdots & U_{iN}(t_2) \\
\vdots      & \vdots      & \ddots & \vdots      \\
U_{i1}(t_M) & U_{i2}(t_M) & \cdots & U_{iN}(t_M)
\end{pmatrix}
\begin{pmatrix} a_{i1} \\ a_{i2} \\ \vdots \\ a_{iN} \end{pmatrix},
\tag{2}
$$

where $Y_i \in \mathbb{R}^{M \times 1}$ is the payoff vector of $i$ with $y_i(t_\mu) = g_i(t_\mu)$
($\mu = 1, \cdots, M$), $X_i \in \mathbb{R}^{N \times 1}$ is the neighboring vector of $i$ with
$x_{ij} = a_{ij}$ ($j = 1, \cdots, N$) and $\Phi_i \in \mathbb{R}^{M \times N}$ is the
virtual-payoff matrix of $i$ with $\phi_{ij}(t_\mu) = U_{ij}(t_\mu)$.

Because $U_{ij}(t)$ is determined by $(p_i(t), q_i(t))$ and $(p_j(t), q_j(t))$ according to
Eq. (1), $Y_i$ and $\Phi_i$ can be collected or calculated directly from the time series of
strategies and payoffs. Our goal is to reconstruct $X_i$ from $Y_i$ and $\Phi_i$. Note that the
number of nonzero elements in $X_i$, i.e., the number of the neighbors of $i$, is usually much less
than the length $N$ of $X_i$. This indicates that $X_i$ is sparse, which is ensured by the natural
sparsity of complex networks. An intuitive illustration of the reconstruction method is shown in
Fig. 1. Thus, the problem of identifying the neighborhood of $i$ is transformed into that of sparse
signal reconstruction, which can be addressed by using the lasso.
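
As an illustration of how $Y_i$ and $\Phi_i$ could be assembled from recorded time series, the sketch below computes the virtual payoffs from the ultimatum-game rules (a proposal is accepted when the offer is at least the responder's acceptance level, with the total amount normalized to 1) and checks that $Y_i = \Phi_i X_i$ holds on a toy network. The data layout, the function names, the 4-node adjacency matrix and the convention $\phi_{ii} = 0$ are assumptions made for the example.

```python
import random

def profit(si, sj):
    """Profit U_ij of player i against j; each strategy is a pair (p, q)."""
    (pi, qi), (pj, qj) = si, sj
    u = 0.0
    if pi >= qj:   # j accepts i's offer, so i keeps 1 - p_i
        u += 1.0 - pi
    if pj >= qi:   # i accepts j's offer, so i receives p_j
        u += pj
    return u

def build_system(i, strategies, payoffs):
    """Return (Y_i, Phi_i); strategies[t][j] = (p_j, q_j) and payoffs[t][j] = g_j at time t."""
    n = len(strategies[0])
    Y = [payoffs[t][i] for t in range(len(strategies))]
    Phi = [[0.0 if j == i else profit(s[i], s[j]) for j in range(n)] for s in strategies]
    return Y, Phi

# Hypothetical toy data: 4 players, 3 recorded rounds, random strategies.
rng = random.Random(0)
adj = [[0, 1, 0, 1], [1, 0, 1, 0], [0, 1, 0, 0], [1, 0, 0, 0]]
N, M = 4, 3
strategies = [[(rng.random(), rng.random()) for _ in range(N)] for _ in range(M)]
payoffs = [[sum(adj[i][j] * profit(s[i], s[j]) for j in range(N)) for i in range(N)]
           for s in strategies]

Y, Phi = build_system(0, strategies, payoffs)
# Largest deviation between the measured payoffs and Phi_i times the true neighboring vector.
residual = max(abs(Y[t] - sum(Phi[t][j] * adj[0][j] for j in range(N))) for t in range(M))
```

Each row of `Phi` contains virtual payoffs against all $N$ nodes, including non-neighbors; only the unknown vector $X_i$ selects which of them actually contribute to the measured payoff.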

The lasso is a convex optimization method for solving

$$
\min_{X_i} \left\{ \| Y_i - \Phi_i X_i \|_2^2 + \lambda \| X_i \|_1 \right\},
\tag{3}
$$

where $\lambda$ is a nonnegative regularization parameter [18, 19]. The sparsity of the solution is
ensured by $\| X_i \|_1$ in the lasso according to the compressed sensing theory [20]. Meanwhile,
the least square term $\| Y_i - \Phi_i X_i \|_2^2$ makes the solution more robust against noise in
time series and missing data of partial nodes than an optimization based on the $L_1$-norm alone.

FIG. 1. Illustration of reconstructing the local structure of a node. For the red node with three
neighbors, #2, #4 and #9 in blue, we can establish vector $Y$ and matrix $\Phi$ in the
reconstruction form $Y = \Phi X$ from data, where vector $X$ captures the neighbors of the red
node. If the reconstruction is accurate, elements in the 2nd, 4th and 9th rows of $X$
corresponding to nodes #2, #4 and #9 will be nonzero values (black color), while the other
elements are zero (white color). The length of $X$ is $N$, which is in general much larger than the
average degree of a node, say, three neighbors, assuring the sparsity of $X$. In a similar fashion,
the local structure of each node can be recovered from relatively small amounts of data compared to
the network size by using the lasso. Note that only one set of data is used to reconstruct local
structures of different nodes, which ensures the sparse data requirement.
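
For readers who want to experiment, one common way to solve a lasso problem is cyclic coordinate descent with soft-thresholding. The sketch below is one such solver, not the implementation used in this work. It assumes the squared-error term is scaled by $1/(2M)$, a widespread software convention that only rescales $\lambda$, and it uses a fixed number of sweeps. The toy problem, the Gaussian sensing matrix and the cutoff of 0.5 are hypothetical.

```python
import random

def soft_threshold(rho, lam):
    """Shrink rho toward zero by lam (the proximal map of the L1 penalty)."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0

def lasso_cd(Phi, Y, lam=1e-3, n_iter=300):
    """Minimise (1/2M)*||Y - Phi X||_2^2 + lam*||X||_1 by cyclic coordinate descent."""
    M, N = len(Phi), len(Phi[0])
    x = [0.0] * N
    resid = list(Y)  # residual Y - Phi x, with x = 0 initially
    col_sq = [sum(Phi[t][j] ** 2 for t in range(M)) / M for j in range(N)]
    for _ in range(n_iter):
        for j in range(N):
            if col_sq[j] == 0.0:
                continue  # an all-zero column carries no information
            # Correlation of column j with the residual, excluding x_j's own contribution.
            rho = sum(Phi[t][j] * (resid[t] + Phi[t][j] * x[j]) for t in range(M)) / M
            new = soft_threshold(rho, lam) / col_sq[j]
            diff = new - x[j]
            if diff != 0.0:
                for t in range(M):
                    resid[t] -= Phi[t][j] * diff
                x[j] = new
    return x

# Hypothetical toy problem: N = 30 candidate nodes, 3 true neighbors, M = 15 samples (Data = 0.5).
rng = random.Random(0)
N, M, true_neighbors = 30, 15, {2, 4, 9}
Phi = [[rng.gauss(0.0, 1.0) for _ in range(N)] for _ in range(M)]
Y = [sum(Phi[t][j] for j in true_neighbors) for t in range(M)]
X_hat = lasso_cd(Phi, Y)
inferred = {j for j, v in enumerate(X_hat) if abs(v) > 0.5}
```

Comparing `inferred` with `true_neighbors` for different values of `M` gives a rough feel for how the required amount of data depends on sparsity. A production setting would add a convergence test instead of a fixed sweep count.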

The neighborhood of $i$ is given by the reconstructed vector $X_i$, in which all nonzero elements
correspond to direct neighbors of $i$. In a similar fashion, we construct the reconstruction
equations of all nodes, yielding the neighboring sets of all nodes. The whole network can then be
assembled by simply matching the neighborhoods of nodes. Due to the sparsity of $X_i$, it can be
reconstructed by using the lasso from a small amount of data that is much less than the length of
$X_i$, i.e., network size $N$. Although we infer the local structure of each node separately by
constructing its own reconstruction equation, we only use one set of data sampled from the time
series. This enables a sparse data requirement for recovering the whole network.
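
The assembly step can be sketched as below. The text does not specify how two neighborhoods are matched when they disagree, so the threshold value and the choice between requiring mutual agreement or accepting a link reported by either endpoint are assumptions of this example, as is the function name.

```python
def assemble_network(X, threshold=0.5, require_mutual=False):
    """Turn reconstructed vectors X[i][j] into a set of undirected edges (i, j) with i < j."""
    n = len(X)
    # Neighboring set of each node: entries of X_i that exceed the threshold in magnitude.
    nbrs = [{j for j in range(n) if j != i and abs(X[i][j]) > threshold} for i in range(n)]
    edges = set()
    for i in range(n):
        for j in range(i + 1, n):
            i_lists_j, j_lists_i = j in nbrs[i], i in nbrs[j]
            keep = (i_lists_j and j_lists_i) if require_mutual else (i_lists_j or j_lists_i)
            if keep:
                edges.add((i, j))
    return edges

# Hypothetical reconstructed vectors for a 3-node network.
X = [[0.00, 0.98, 0.02],
     [1.01, 0.00, 0.03],
     [0.01, 0.70, 0.00]]
edges_either = assemble_network(X)
edges_mutual = assemble_network(X, require_mutual=True)
```

Here node 2 reports node 1 as a neighbor while node 1 does not report node 2, so the two matching rules give different edge sets.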

We consider current transportation in a network consisting of resistors [22]. The resistance of a
resistor between nodes $i$ and $j$ is denoted by $r_{ij}$. If $i$ and $j$ are not directly
connected by a resistor, $r_{ij} = \infty$. For arbitrary node $i$, according to Kirchhoff's law,
we have

$$
\sum_{j=1}^{N} \frac{1}{r_{ij}} (V_i - V_j) = I_i,
\tag{4}
$$

where $V_i$ and $V_j$ are the voltages at $i$ and $j$ and $I_i$ is the total electrical current at
$i$. To better mimic real power networks, alternating current is considered. Specifically, at node
$i$, $V_i = V \sin[(\omega + \Delta\omega_i) t]$, where the constant $V$ is the voltage peak,
$\omega$ is the frequency and $\Delta\omega_i$ is a perturbation. Without loss of generality, we
set $V = 1$, $\omega = 10^3$ and the random number $\Delta\omega_i \in [0, 20]$. Given voltages at
nodes and resistances of links, currents at nodes can be calculated according to Kirchhoff's laws
at different time instants. We assume that only voltages and electrical currents at nodes are
measurable and our purpose is to reconstruct the resistor network. In analogy with networked
ultimatum games, based on Eq. (4), we can establish the reconstruction equation
$Y_i = \Phi_i \times X_i$ with respect to time instants $t_1, \cdots, t_M$, where
$y_i(t_\mu) = I_i(t_\mu)$, $x_{ij} = 1/r_{ij}$ and $\phi_{ij}(t_\mu) = V_i(t_\mu) - V_j(t_\mu)$
with $\mu = 1, \cdots, M$ and $j = 1, \cdots, N$. Here, if $i$ and $j$ are connected by a resistor,
$x_{ij} = 1/r_{ij}$ is nonzero; otherwise, $x_{ij} = 0$. Thus, the neighboring vector $X_i$ is
sparse and can be reconstructed by using the lasso from a small amount of data. Analogously, the
whole network can be recovered by separately reconstructing the neighboring vectors of all nodes.

FIG. 2. Reconstructed values of elements in vector $X$ for UG on small-world networks [29] for
different data amounts (Data = 0.1, 0.4, 1.0 and 3.0) (a) without measurement noise and (b) with
Gaussian noise ($\mathcal{N}(0, 0.3^2)$). (c) TPR versus FPR and (d) Precision versus Recall for
different data amounts for UG on WS small-world networks without noise. In (c) and (d), the dashed
lines represent the results of completely random guesses. The network size $N$ is 100, and the
average degree $\langle k \rangle = 6$. Rewiring probability of small-world networks is 0.3. There
are no externally inaccessible nodes. The parameter $\lambda$ is set to be $10^{-3}$. We have
tested a wide range of values of $\lambda$, finding that optimal reconstruction performance can be
achieved in the range $[10^{-4}, 10^{-2}]$ and that the reconstruction performance in this range is
insensitive to $\lambda$. Thus, we set $\lambda = 10^{-3}$ for all reconstructions.
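
The resistor-network case can be mimicked with a few lines of code. The sketch below assumes Kirchhoff's law in the form $I_i = \sum_j (V_i - V_j)/r_{ij}$, represents each link by its conductance $1/r_{ij}$ (zero when there is no resistor), and samples the voltages at random times. The 4-node conductance matrix, the sampling interval and all function names are hypothetical.

```python
import math
import random

def node_voltages(t, dw, V=1.0, w=1e3):
    """V_i(t) = V * sin((w + dw_i) * t) for every node i."""
    return [V * math.sin((w + d) * t) for d in dw]

def node_currents(v, g):
    """Kirchhoff's law: I_i = sum_j g_ij * (V_i - V_j), with g_ij = 1/r_ij (0 if no resistor)."""
    n = len(v)
    return [sum(g[i][j] * (v[i] - v[j]) for j in range(n)) for i in range(n)]

def build_resistor_system(i, v_series, i_series):
    """Y_i holds I_i(t_mu); row mu of Phi_i holds V_i(t_mu) - V_j(t_mu) for all j."""
    Y = [cur[i] for cur in i_series]
    Phi = [[v[i] - vj for vj in v] for v in v_series]
    return Y, Phi

# Hypothetical 4-node network; g[i][j] = 1/r_ij.
g = [[0.0, 0.5, 0.0, 2.0],
     [0.5, 0.0, 1.0, 0.0],
     [0.0, 1.0, 0.0, 0.0],
     [2.0, 0.0, 0.0, 0.0]]
rng = random.Random(1)
dw = [rng.uniform(0.0, 20.0) for _ in range(4)]        # frequency perturbations
times = [rng.uniform(0.0, 1.0) for _ in range(6)]      # M = 6 sampling instants
v_series = [node_voltages(t, dw) for t in times]
i_series = [node_currents(v, g) for v in v_series]

Y, Phi = build_resistor_system(0, v_series, i_series)
# Deviation between measured currents and Phi_0 times the true vector (1/r_0j).
residual = max(abs(Y[m] - sum(Phi[m][j] * g[0][j] for j in range(4))) for m in range(6))
```

Unlike the ultimatum-game case, the nonzero entries of the unknown vector are link weights $1/r_{ij}$ rather than ones, so a successful reconstruction recovers resistances as well as topology.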

We propose a simple network model to capture communications in populations via phones, emails,
etc. At each time, individual $i$ may contact one of his/her neighbors $j$ according to probability
$w_{ij}$ by sending data packets. If $i$ and $j$ are not connected, $w_{ij} = 0$. In a period, the
total incoming flux $f_i$ of $i$ can be described as

$$
f_i = \sum_{j=1}^{N} w_{ij} f_j,
\tag{5}
$$

where $f_j$ is the total outgoing flux from $j$ to its neighbors in the period and the
probabilities $w_{ij}$ are normalized to sum to 1. Equation (5) is valid because of the flux
conservation in the network. In the real situation, $f_j$ usually fluctuates with time, providing
an independent relationship between incoming and outgoing fluxes for constructing the
reconstruction equation $Y_i = \Phi_i \times X_i$. Here, $y_i(t_\mu) = f_i(t_\mu)$ is the total
incoming flux of $i$ at time period $t_\mu$, $\phi_{ij}(t_\mu) = f_j(t_\mu)$ is the total outgoing
flux of $j$ at time period $t_\mu$, and $x_{ij} = w_{ij}$ captures connections between $i$ and its
neighbors. Given the total incoming and outgoing fluxes of nodes, which can be measured without the
need for any network information or communication content, we can likewise use the lasso to
reconstruct the neighboring set of node $i$ and those of the other nodes, such that full
reconstruction of the whole network is achieved from sparse data.
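
A small numerical check of the communication model may help fix ideas. The sketch assumes that each sender $j$ splits its outgoing packets among recipients with weights that sum to 1 over the recipients (that is, the columns of the weight matrix sum to 1), which is one reading of the normalization that makes flux conservation hold. The 3-node weight matrix, the flux range and the variable names are hypothetical.

```python
import random

def incoming_flux(out_flux, W):
    """Incoming flux of every node in one period: f_i = sum_j w_ij * (outgoing flux of j)."""
    n = len(out_flux)
    return [sum(W[i][j] * out_flux[j] for j in range(n)) for i in range(n)]

# Hypothetical 3-node example; column j describes how node j splits its packets.
W = [[0.0, 0.5, 1.0],
     [0.7, 0.0, 0.0],
     [0.3, 0.5, 0.0]]
rng = random.Random(2)
out_series = [[rng.uniform(50.0, 150.0) for _ in range(3)] for _ in range(5)]  # M = 5 periods
in_series = [incoming_flux(o, W) for o in out_series]

i = 0
Y = [f[i] for f in in_series]   # y_i(t_mu): incoming flux of node i
Phi = out_series                # phi_ij(t_mu): outgoing flux of node j
residual = max(abs(Y[t] - sum(Phi[t][j] * W[i][j] for j in range(3))) for t in range(5))

# Flux conservation: total incoming equals total outgoing in each period.
total_in, total_out = sum(in_series[0]), sum(out_series[0])
```

Note that the matrix `Phi` is the same for every node in this model; only the measured vector `Y` changes from node to node.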

We simulate ultimatum games, electrical currents and communications on both homogeneous and
heterogeneous networks, including random [28], small-world [29] and scale-free [30] networks. For
the three types of dynamical processes, we record strategies and payoffs of players, voltages and
currents, and incoming and outgoing fluxes at nodes at different times, to apply our reconstruction
method with respect to different amounts of Data (Data $= M/N$, where $M$ is the number of
accessible time instances in the time series). Figure 2 shows the results of networked ultimatum
games. For very small amounts of data, e.g., Data = 0.1, links are difficult to identify because of
the mixture of reconstructed elements in $X$, whereas for Data = 0.4, there is a vast and clear gap
between actual links and null connections, assuring perfect reconstruction [Fig. 2(a)]. Even with
strong measurement noise, e.g., $\mathcal{N}(0, 0.3^2)$, full reconstruction can still be
accomplished by increasing Data [Fig. 2(b)]. We use two standard indices, true positive rate (TPR)
versus false positive rate (FPR), and Precision versus Recall, to quantitatively measure
reconstruction performance (see [33] for more details). We see that for Data = 0.4, both the area
under the receiver operating characteristic curve (AUROC) in TPR vs. FPR [Fig. 2(c)] and the area
under the precision-recall curve (AUPR) in Precision vs. Recall [Fig. 2(d)] equal 1, indicating
that links and null connections can be completely distinguished from each other with a certain
threshold. Because high reconstruction accuracy can always be achieved, we explore the minimum data
for assuring 0.95 AUROC and AUPR simultaneously for different types of dynamics and networks. As
displayed in Table I, with little measurement noise and a small fraction of inaccessible nodes,
only a small amount of data is required, especially for large networks, e.g., $N = 1000$. In the
presence of strong noise and a large fraction of missing nodes, high accuracy can still be achieved
from a relatively larger amount of data. We have also tested our method on several empirical
networks (Table II), finding that only sparse data are required for full reconstruction as well.
These results demonstrate that our general approach offers robust reconstruction of complex
networks from sparse data.
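
The performance indices mentioned above can be computed from the reconstructed values and the true link labels. The sketch below evaluates TPR, FPR and Precision at a single threshold and obtains AUROC from its rank interpretation (the probability that a true link receives a larger value than a null connection). It is an illustrative helper under those standard definitions, not the evaluation code behind the reported results; the function names and the example values are hypothetical.

```python
def confusion(scores, labels, threshold):
    """Counts (tp, fp, fn, tn) when pairs with score > threshold are predicted as links."""
    tp = sum(1 for s, y in zip(scores, labels) if s > threshold and y)
    fp = sum(1 for s, y in zip(scores, labels) if s > threshold and not y)
    fn = sum(1 for s, y in zip(scores, labels) if s <= threshold and y)
    tn = len(scores) - tp - fp - fn
    return tp, fp, fn, tn

def rates(scores, labels, threshold):
    """Return (TPR, FPR, Precision) at the given threshold; TPR is the same as Recall."""
    tp, fp, fn, tn = confusion(scores, labels, threshold)
    tpr = tp / (tp + fn) if tp + fn else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    precision = tp / (tp + fp) if tp + fp else 1.0
    return tpr, fpr, precision

def auroc(scores, labels):
    """Probability that a random true link scores above a random null pair (ties count 1/2)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical reconstructed values with a clear gap between links (label 1) and null pairs.
scores = [0.97, 1.02, 0.03, 0.01, 0.60, 0.40]
labels = [1, 1, 0, 0, 1, 0]
perfect = auroc(scores, labels)
tpr, fpr, precision = rates(scores, labels, threshold=0.5)
```

Sweeping `threshold` over the sorted scores and collecting the resulting (FPR, TPR) or (Recall, Precision) pairs traces out the two curves whose areas are AUROC and AUPR.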

TABLE I. Minimum data for achieving at least 0.95 AUROC and AUPR simultaneously for three types of
dynamics, UG, current transportation and communications, in combination with three types of
networks, random (ER), small-world (SW) and scale-free (SF). Here, $N$ is network size,
$\langle k \rangle$ is average degree, $\sigma$ is the standard deviation of the Gaussian noise
$\mathcal{N}(0, \sigma^2)$, and $n_m$ is the proportion of externally inaccessible nodes whose data
are missing. Data denotes the amount of data divided by network size. The results are obtained by
averaging over 10 independent realizations. RN denotes resistor network, and CN denotes
communication network. More details of the reconstruction performance as a function of data amount
for different cases can be found in [33].

N      ⟨k⟩   σ      n_m    UG (ER/SW/SF)        RN (ER/SW/SF)        CN (ER/SW/SF)
100    6     0      0      0.38/0.36/0.41       0.28/0.25/0.32       0.30/0.28/0.30
100    6     0.05   0      0.44/0.43/0.47       0.29/0.26/0.37       0.34/0.31/0.34
100    6     0.3    0      1.68/1.75/1.60       0.32/0.29/0.38       1.72/1.81/1.80
100    6     0      0.05   0.61/0.55/0.64       1.61/1.65/1.60       1.33/1.19/1.32
100    6     0      0.3    2.33/2.03/2.14       5.74/8.51/8.50       5.38/6.23/6.20
100    12    0      0      0.46/0.47/0.52       0.37/0.35/0.42       0.42/0.40/0.42
100    18    0      0      0.53/0.53/0.58       0.44/0.44/0.50       0.50/0.50/0.50
500    6     0      0      0.120/0.116/0.132    0.094/0.080/0.120    0.094/0.088/0.100
1000   6     0      0      0.071/0.068/0.078    0.058/0.049/0.079    0.055/0.050/0.055

TABLE II. Minimum data for achieving at least 0.95 AUROC and AUPR simultaneously for UG, RN and CN
in combination with several real networks. The variables have the same meanings as in Table I. See
[33] for more details.

Dynamics   Network       N      ⟨k⟩    Data
UG         Karate        34     4.6    0.69
UG         Dolphins      62     5.1    0.50
UG         Netscience    1589   3.5    0.07
RN         IEEE39BUS     39     2.4    0.33
RN         IEEE118BUS    118    3.0    0.23
RN         IEEE300BUS    300    2.7    0.10
CN         Football      115    10.7   0.35
CN         Jazz          198    27.7   0.49
CN         Email         1133   9.6    0.10

In conclusion, we develop a general framework to reconstruct complex networks with great
robustness from sparse data that in general can be much less than network sizes. The key to our
method lies in decomposing the task of reconstructing the whole network into inferring local
connections of nodes individually. Due to the natural sparsity of complex networks, recovering
local structures from time series can be converted into a sparse signal reconstruction problem
that can be resolved by using the lasso, in which the error control term and the $L_1$-norm jointly
enable robust reconstruction from sparse data. Once all local structures are ascertained, the whole
network can be assembled by simply matching them. Our method has been validated by the combinations
of three representative dynamical processes and a variety of model and real networks with noise and
inaccessible nodes. High reconstruction accuracy can be achieved for all cases from relatively
small amounts of data.

It is noteworthy that our reconstruction framework is quite flexible and not limited to the
networked systems considered here. The crucial issue is to find a certain relationship between
local structures and measurable data to construct the reconstruction form $Y = \Phi X$. Indeed,
there is no general manner to establish the reconstruction form for different networked systems,
implying that the application scope of our approach is not yet completely known. Nevertheless, our
method could have broad applications in many fields due to its sparse data requirement and its
advantages in robustness against noise and missing information. In addition, network
reconstruction allows us to infer intrinsic nodal dynamics from time series by canceling the
influence from neighbors, although this is beyond our current scope. Taken together, our approach
offers deeper understanding of complex networked systems from observable data and has potential
applications in predicting and controlling collective dynamics of complex systems, especially when
we encounter explosive growth of data in the information era.


* wenxuwang@bnu.edu.cn 